Overview

Dataset statistics

Number of variables: 24
Number of observations: 69958
Missing cells: 236314
Missing cells (%): 14.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.8 MiB
Average record size in memory: 192.0 B
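
The summary figures above are mutually consistent, which a few lines of arithmetic can confirm. This is only a sketch under two assumptions that the report does not state: the missing-cell percentage is taken over all cells (observations × variables), and 1 MiB means 2**20 bytes.

```python
# Values copied from the "Dataset statistics" block of the report.
n_variables = 24
n_observations = 69958
missing_cells = 236314
avg_record_bytes = 192.0

# Assumption: the missing percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
total_mib = n_observations * avg_record_bytes / 2**20

print(f"missing cells: {missing_pct:.1f}%")   # 14.1%
print(f"total size: {total_mib:.1f} MiB")     # 12.8 MiB
```

Both results match the reported 14.1% and 12.8 MiB after rounding to one decimal place.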

Variable types

Text: 3
Categorical: 8
Numeric: 10
DateTime: 3

Alerts

ACCOUNT_TYPE is highly overall correlated with OWNERSHIP_TYPE [High correlation]
ACTUAL_PAYMT_AMT is highly overall correlated with AMOUNT_OVERDUE and 4 other fields [High correlation]
ACTUAL_ROI is highly overall correlated with CURRENT_BALANCE and 2 other fields [High correlation]
AMOUNT_OVERDUE is highly overall correlated with ACTUAL_PAYMT_AMT and 4 other fields [High correlation]
COLLATERALVALUE is highly overall correlated with ACTUAL_PAYMT_AMT and 3 other fields [High correlation]
CURRENT_BALANCE is highly overall correlated with ACTUAL_PAYMT_AMT and 6 other fields [High correlation]
DATE_REPORTED_AND_CERTIFIED is highly overall correlated with OCCUPATION_TYPE and 2 other fields [High correlation]
EMI_AMOUNT is highly overall correlated with ACTUAL_PAYMT_AMT and 4 other fields [High correlation]
HIGH_CREDIT_OR_SANCTIONED_AMOUNT is highly overall correlated with ACTUAL_PAYMT_AMT and 5 other fields [High correlation]
LOAN_CLASSIFICATION is highly overall correlated with AMOUNT_OVERDUE and 1 other field [High correlation]
OCCUPATION_TYPE is highly overall correlated with DATE_REPORTED_AND_CERTIFIED and 3 other fields [High correlation]
OWNERSHIP_TYPE is highly overall correlated with ACCOUNT_TYPE and 3 other fields [High correlation]
PAYMENT_HISTORY_END_DATE is highly overall correlated with OWNERSHIP_TYPE [High correlation]
PAYMENT_HISTORY_START_DATE is highly overall correlated with DATE_REPORTED_AND_CERTIFIED and 2 other fields [High correlation]
REPAYMENT_TENURE is highly overall correlated with CURRENT_BALANCE [High correlation]
Reported_Date is highly overall correlated with DATE_REPORTED_AND_CERTIFIED and 2 other fields [High correlation]
TU_SCORE is highly overall correlated with LOAN_CLASSIFICATION [High correlation]
PAYMENT_HISTORY_START_DATE is highly imbalanced (> 99.9%) [Imbalance]
DATE_REPORTED_AND_CERTIFIED is highly imbalanced (> 99.9%) [Imbalance]
Reported_Date is highly imbalanced (> 99.9%) [Imbalance]
ACTUAL_PAYMT_AMT has 4217 (6.0%) missing values [Missing]
EMI_AMOUNT has 49766 (71.1%) missing values [Missing]
REPAYMENT_TENURE has 37176 (53.1%) missing values [Missing]
AMOUNT_OVERDUE has 60089 (85.9%) missing values [Missing]
PAYMENT_HISTORY_2 has 39673 (56.7%) missing values [Missing]
COLLATERALVALUE has 22968 (32.8%) missing values [Missing]
DATE_OF_LAST_PAYMENT has 3312 (4.7%) missing values [Missing]
OCCUPATION_TYPE has 19113 (27.3%) missing values [Missing]
HIGH_CREDIT_OR_SANCTIONED_AMOUNT is highly skewed (γ1 = 70.97189492) [Skewed]
CURRENT_BALANCE is highly skewed (γ1 = 78.0388996) [Skewed]
ACTUAL_PAYMT_AMT is highly skewed (γ1 = 112.6517969) [Skewed]
EMI_AMOUNT is highly skewed (γ1 = 35.71864192) [Skewed]
AMOUNT_OVERDUE is highly skewed (γ1 = 58.2778575) [Skewed]
COLLATERALVALUE is highly skewed (γ1 = 39.33065288) [Skewed]
ID has unique values [Unique]
LOAN_CLASSIFICATION has 60089 (85.9%) zeros [Zeros]
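
The γ1 values in the skewness alerts are skewness coefficients. As a rough illustration of what such a number measures, the sketch below computes the plain moment-based skewness on two toy lists. The function name and the data are hypothetical, and the report's own estimator may apply a bias correction that this sketch omits.

```python
def skewness(values):
    """Moment-based sample skewness m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Toy data (not from the report): one symmetric list, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1000]

print(skewness(symmetric))
print(skewness(long_right_tail))
```

A symmetric sample gives zero, while a single very large value pulls the coefficient strongly positive. This matches the pattern in the amount columns, where a handful of very large loans dominate the upper tail.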

Reproduction

Analysis started: 2024-03-03 07:21:44.942725
Analysis finished: 2024-03-03 07:22:45.655556
Duration: 1 minute and 0.71 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json
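
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-03-03 07:21:44.942725")
finished = datetime.fromisoformat("2024-03-03 07:22:45.655556")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```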

Variables

ID
Text

UNIQUE 

Distinct69958
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size546.7 KiB

Length

Max length11
Median length10
Mean length10.089125
Min length10

Characters and Unicode

Total characters705815
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
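
To illustrate how the category counts in these sections can be derived, the sketch below uses Python's standard `unicodedata` module to tally the general category of each character in one of the sample IDs. The helper name `category_counts` is hypothetical, and the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Nd', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample row of the ID variable.
counts = category_counts("A002338349")
print(dict(counts))

# The underscore falls in the Connector Punctuation category ('Pc').
print(unicodedata.category("_"))
```

Here `Nd` is Decimal Number and `Lu` is Uppercase Letter, the same category names that appear in the tables of this section.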

Unique

Unique69958
Unique (%)100.0%

Sample

1st rowA002338349
2nd rowA002000537
3rd rowA002421579
4th rowA002152345
5th rowA001952834
ValueCountFrequency (%)
a002338349 1
 
< 0.1%
a001137499 1
 
< 0.1%
b000113967 1
 
< 0.1%
a001558607 1
 
< 0.1%
a000155334 1
 
< 0.1%
a001091986 1
 
< 0.1%
a001132923 1
 
< 0.1%
a002421104 1
 
< 0.1%
a000936177 1
 
< 0.1%
a002232156 1
 
< 0.1%
Other values (69948) 69948
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0 218836
31.0%
1 72906
 
10.3%
A 55027
 
7.8%
2 54033
 
7.7%
3 44719
 
6.3%
4 42499
 
6.0%
6 40396
 
5.7%
8 39626
 
5.6%
5 39623
 
5.6%
7 39556
 
5.6%
Other values (11) 58594
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 631543
89.5%
Uppercase Letter 71974
 
10.2%
Connector Punctuation 2298
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 218836
34.7%
1 72906
 
11.5%
2 54033
 
8.6%
3 44719
 
7.1%
4 42499
 
6.7%
6 40396
 
6.4%
8 39626
 
6.3%
5 39623
 
6.3%
7 39556
 
6.3%
9 39349
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
A 55027
76.5%
B 10699
 
14.9%
J 2016
 
2.8%
K 2016
 
2.8%
D 1911
 
2.7%
E 126
 
0.2%
H 68
 
0.1%
C 60
 
0.1%
F 49
 
0.1%
G 2
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 2298
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 633841
89.8%
Latin 71974
 
10.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 218836
34.5%
1 72906
 
11.5%
2 54033
 
8.5%
3 44719
 
7.1%
4 42499
 
6.7%
6 40396
 
6.4%
8 39626
 
6.3%
5 39623
 
6.3%
7 39556
 
6.2%
9 39349
 
6.2%
Latin
ValueCountFrequency (%)
A 55027
76.5%
B 10699
 
14.9%
J 2016
 
2.8%
K 2016
 
2.8%
D 1911
 
2.7%
E 126
 
0.2%
H 68
 
0.1%
C 60
 
0.1%
F 49
 
0.1%
G 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 705815
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 218836
31.0%
1 72906
 
10.3%
A 55027
 
7.8%
2 54033
 
7.7%
3 44719
 
6.3%
4 42499
 
6.0%
6 40396
 
5.7%
8 39626
 
5.6%
5 39623
 
5.6%
7 39556
 
5.6%
Other values (11) 58594
 
8.3%

ACCOUNT_TYPE
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size546.7 KiB
Housing Loan
36133 
Personal Loan
15704 
Property Loan
10857 
Business Loan
7264 

Length

Max length13
Median length12
Mean length12.483504
Min length12

Characters and Unicode

Total characters873321
Distinct characters18
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowHousing Loan
2nd rowHousing Loan
3rd rowHousing Loan
4th rowHousing Loan
5th rowHousing Loan

Common Values

Value          Count   Frequency (%)
Housing Loan   36133   51.6%
Personal Loan  15704   22.4%
Property Loan  10857   15.5%
Business Loan   7264   10.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
loan 69958
50.0%
housing 36133
25.8%
personal 15704
 
11.2%
property 10857
 
7.8%
business 7264
 
5.2%

Most occurring characters

ValueCountFrequency (%)
o 132652
15.2%
n 129059
14.8%
a 85662
9.8%
s 73629
8.4%
69958
8.0%
L 69958
8.0%
u 43397
 
5.0%
i 43397
 
5.0%
r 37418
 
4.3%
H 36133
 
4.1%
Other values (8) 152058
17.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 663447
76.0%
Uppercase Letter 139916
 
16.0%
Space Separator 69958
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 132652
20.0%
n 129059
19.5%
a 85662
12.9%
s 73629
11.1%
u 43397
 
6.5%
i 43397
 
6.5%
r 37418
 
5.6%
g 36133
 
5.4%
e 33825
 
5.1%
l 15704
 
2.4%
Other values (3) 32571
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
L 69958
50.0%
H 36133
25.8%
P 26561
 
19.0%
B 7264
 
5.2%
Space Separator
ValueCountFrequency (%)
69958
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 803363
92.0%
Common 69958
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 132652
16.5%
n 129059
16.1%
a 85662
10.7%
s 73629
9.2%
L 69958
8.7%
u 43397
 
5.4%
i 43397
 
5.4%
r 37418
 
4.7%
H 36133
 
4.5%
g 36133
 
4.5%
Other values (7) 115925
14.4%
Common
ValueCountFrequency (%)
69958
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 873321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 132652
15.2%
n 129059
14.8%
a 85662
9.8%
s 73629
8.4%
69958
8.0%
L 69958
8.0%
u 43397
 
5.0%
i 43397
 
5.0%
r 37418
 
4.3%
H 36133
 
4.1%
Other values (8) 152058
17.4%

HIGH_CREDIT_OR_SANCTIONED_AMOUNT
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct48767
Distinct (%)69.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1354749.4
Minimum4282
Maximum5.4 × 10^8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum4282
5-th percentile25000
Q1364466.5
median939368.5
Q31578010.5
95-th percentile3795019.3
Maximum5.4 × 10^8
Range5.3999572 × 10^8
Interquartile range (IQR)1213544

Descriptive statistics

Standard deviation3656262.8
Coefficient of variation (CV)2.6988481
Kurtosis8381.8407
Mean1354749.4
Median Absolute Deviation (MAD)614155.5
Skewness70.971895
Sum9.4775558 × 10^10
Variance1.3368257 × 10^13
MonotonicityNot monotonic
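
Several of the statistics above are simple functions of the others, which gives a quick way to sanity-check a block like this one. The sketch below uses the rounded values printed in the report, with the maximum taken from the list of largest values, so agreement is only expected up to rounding.

```python
# Values copied from the HIGH_CREDIT_OR_SANCTIONED_AMOUNT block.
minimum, maximum = 4282, 540_000_000
q1, q3 = 364466.5, 1578010.5
mean, std = 1354749.4, 3656262.8

value_range = maximum - minimum   # reported as 5.3999572 × 10^8
iqr = q3 - q1                     # reported as 1213544
cv = std / mean                   # reported as 2.6988481
variance = std ** 2               # reported as 1.3368257 × 10^13

print(value_range, iqr, round(cv, 7), f"{variance:.7e}")
```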
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
50000 848
 
1.2%
20000 794
 
1.1%
25000 641
 
0.9%
100000 570
 
0.8%
30000 559
 
0.8%
300000 552
 
0.8%
14400 502
 
0.7%
40000 417
 
0.6%
500000 408
 
0.6%
200000 359
 
0.5%
Other values (48757) 64308
91.9%
ValueCountFrequency (%)
4282 1
 
< 0.1%
4362 1
 
< 0.1%
5000 102
0.1%
5412 1
 
< 0.1%
6165 1
 
< 0.1%
6230 1
 
< 0.1%
6684 1
 
< 0.1%
7000 50
0.1%
7739 1
 
< 0.1%
7751 1
 
< 0.1%
ValueCountFrequency (%)
540000000 1
< 0.1%
300000000 1
< 0.1%
272500000 1
< 0.1%
253922969 1
< 0.1%
215000000 1
< 0.1%
140486389 1
< 0.1%
116018923 1
< 0.1%
114999999 1
< 0.1%
99964530 1
< 0.1%
74099999 1
< 0.1%
Distinct4554
Distinct (%)6.5%
Missing0
Missing (%)0.0%
Memory size546.7 KiB
Minimum2002-05-23 00:00:00
Maximum2023-12-11 00:00:00
Histogram with fixed size bins (bins=50)

CURRENT_BALANCE
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct64261
Distinct (%)91.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1131129.2
Minimum1067
Maximum5.1756481 × 10^8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum1067
5-th percentile18036.85
Q1228626.5
median705422.5
Q31331960
95-th percentile3419083.6
Maximum5.1756481 × 10^8
Range5.1756375 × 10^8
Interquartile range (IQR)1103333.5

Descriptive statistics

Standard deviation3398279.9
Coefficient of variation (CV)3.0043251
Kurtosis9621.9409
Mean1131129.2
Median Absolute Deviation (MAD)524175
Skewness78.0389
Sum7.9131538 × 10^10
Variance1.1548306 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25000 226
 
0.3%
20000 213
 
0.3%
14400 180
 
0.3%
11585 169
 
0.2%
8738 146
 
0.2%
50000 131
 
0.2%
10000 126
 
0.2%
35000 123
 
0.2%
16761 117
 
0.2%
45000 112
 
0.2%
Other values (64251) 68415
97.8%
ValueCountFrequency (%)
1067 1
< 0.1%
1129 1
< 0.1%
1214 1
< 0.1%
1334 1
< 0.1%
1522 1
< 0.1%
1649 1
< 0.1%
1740 1
< 0.1%
1757 1
< 0.1%
1836 1
< 0.1%
1839 1
< 0.1%
ValueCountFrequency (%)
517564813 1
< 0.1%
295884568 1
< 0.1%
276506532 1
< 0.1%
235061727 1
< 0.1%
207133491 1
< 0.1%
112981953 1
< 0.1%
104874827 1
< 0.1%
103162557 1
< 0.1%
63777411 1
< 0.1%
62576118 1
< 0.1%

ACTUAL_PAYMT_AMT
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct27311
Distinct (%)41.5%
Missing4217
Missing (%)6.0%
Infinite0
Infinite (%)0.0%
Mean16407.994
Minimum1
Maximum13485322
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum1
5-th percentile500
Q14433
median10374
Q318294
95-th percentile40765
Maximum13485322
Range13485321
Interquartile range (IQR)13861

Descriptive statistics

Standard deviation72253.913
Coefficient of variation (CV)4.4035798
Kurtosis19087.909
Mean16407.994
Median Absolute Deviation (MAD)6586
Skewness112.6518
Sum1.078678 × 10^9
Variance5.2206279 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
500 3199
 
4.6%
3000 417
 
0.6%
1000 395
 
0.6%
1180 376
 
0.5%
2980 330
 
0.5%
2360 183
 
0.3%
1770 176
 
0.3%
590 137
 
0.2%
3469 112
 
0.2%
3788 109
 
0.2%
Other values (27301) 60307
86.2%
(Missing) 4217
 
6.0%
ValueCountFrequency (%)
1 24
< 0.1%
2 13
< 0.1%
3 6
 
< 0.1%
4 6
 
< 0.1%
5 4
 
< 0.1%
6 3
 
< 0.1%
7 5
 
< 0.1%
8 8
 
< 0.1%
9 1
 
< 0.1%
10 3
 
< 0.1%
ValueCountFrequency (%)
13485322 1
< 0.1%
5350961 1
< 0.1%
3701180 1
< 0.1%
3316197 1
< 0.1%
2600000 1
< 0.1%
2363270 1
< 0.1%
2052500 1
< 0.1%
1730000 1
< 0.1%
1700000 1
< 0.1%
1672474 1
< 0.1%

EMI_AMOUNT
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct14364
Distinct (%)71.1%
Missing49766
Missing (%)71.1%
Infinite0
Infinite (%)0.0%
Mean18721.456
Minimum15
Maximum3700000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB
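
The "Distinct (%)" figures for columns with missing values appear to be computed over the non-missing rows rather than over all 69958 observations. The report does not state this, but the arithmetic supports the reading. The sketch below checks it for two columns; the helper name `distinct_pct` is hypothetical.

```python
n_rows = 69958  # number of observations in the report

def distinct_pct(distinct, missing):
    """Distinct values as a percentage of the non-missing rows."""
    return 100 * distinct / (n_rows - missing)

emi = distinct_pct(14364, 49766)    # EMI_AMOUNT
paymt = distinct_pct(27311, 4217)   # ACTUAL_PAYMT_AMT

print(f"EMI_AMOUNT: {emi:.1f}%")          # 71.1%
print(f"ACTUAL_PAYMT_AMT: {paymt:.1f}%")  # 41.5%
```

For EMI_AMOUNT the distinct percentage and the missing percentage both round to 71.1%, but they are different quantities that happen to coincide.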

Quantile statistics

Minimum15
5-th percentile2077
Q16477.5
median12005.5
Q319913
95-th percentile43406.35
Maximum3700000
Range3699985
Interquartile range (IQR)13435.5

Descriptive statistics

Standard deviation61515.261
Coefficient of variation (CV)3.285816
Kurtosis1723.0905
Mean18721.456
Median Absolute Deviation (MAD)6269.5
Skewness35.718642
Sum3.7802364 × 10^8
Variance3.7841273 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2980 93
 
0.1%
3468 66
 
0.1%
3788 52
 
0.1%
163 28
 
< 0.1%
4630 16
 
< 0.1%
82 15
 
< 0.1%
100000 14
 
< 0.1%
10696 13
 
< 0.1%
841 13
 
< 0.1%
210 13
 
< 0.1%
Other values (14354) 19869
 
28.4%
(Missing) 49766
71.1%
ValueCountFrequency (%)
15 2
 
< 0.1%
17 2
 
< 0.1%
20 1
 
< 0.1%
21 8
< 0.1%
28 4
< 0.1%
30 3
 
< 0.1%
33 3
 
< 0.1%
35 1
 
< 0.1%
36 2
 
< 0.1%
41 2
 
< 0.1%
ValueCountFrequency (%)
3700000 1
< 0.1%
3316197 1
< 0.1%
3291763 1
< 0.1%
2537872 1
< 0.1%
1988087 1
< 0.1%
1700000 1
< 0.1%
1579911 1
< 0.1%
1327942 1
< 0.1%
1200000 1
< 0.1%
1155815 1
< 0.1%

REPAYMENT_TENURE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct539
Distinct (%)1.6%
Missing37176
Missing (%)53.1%
Infinite0
Infinite (%)0.0%
Mean165.99085
Minimum10
Maximum736
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum10
5-th percentile36
Q190
median160
Q3233
95-th percentile325
Maximum736
Range726
Interquartile range (IQR)143

Descriptive statistics

Standard deviation93.270067
Coefficient of variation (CV)0.56189885
Kurtosis0.47146466
Mean165.99085
Median Absolute Deviation (MAD)71
Skewness0.63847916
Sum5441512
Variance8699.3054
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
180 1483
 
2.1%
48 1194
 
1.7%
240 1006
 
1.4%
120 428
 
0.6%
36 401
 
0.6%
300 398
 
0.6%
251 367
 
0.5%
222 356
 
0.5%
260 306
 
0.4%
259 229
 
0.3%
Other values (529) 26614
38.0%
(Missing) 37176
53.1%
ValueCountFrequency (%)
10 1
 
< 0.1%
12 1
 
< 0.1%
14 1
 
< 0.1%
15 1
 
< 0.1%
17 23
 
< 0.1%
18 64
0.1%
19 41
0.1%
20 64
0.1%
21 52
0.1%
22 53
0.1%
ValueCountFrequency (%)
736 1
< 0.1%
704 2
< 0.1%
702 1
< 0.1%
692 1
< 0.1%
688 1
< 0.1%
644 1
< 0.1%
643 2
< 0.1%
640 1
< 0.1%
628 1
< 0.1%
624 1
< 0.1%

LOAN_CLASSIFICATION
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct222
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean29.417036
Minimum0
Maximum900
Zeros60089
Zeros (%)85.9%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile90.15
Maximum900
Range900
Interquartile range (IQR)0

Descriptive statistics

Standard deviation134.43582
Coefficient of variation (CV)4.569999
Kurtosis31.784868
Mean29.417036
Median Absolute Deviation (MAD)0
Skewness5.6285548
Sum2057957
Variance18072.991
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 60089
85.9%
21 2225
 
3.2%
900 1219
 
1.7%
26 1198
 
1.7%
52 1072
 
1.5%
57 430
 
0.6%
82 429
 
0.6%
1 200
 
0.3%
87 198
 
0.3%
113 164
 
0.2%
Other values (212) 2734
 
3.9%
ValueCountFrequency (%)
0 60089
85.9%
1 200
 
0.3%
2 54
 
0.1%
3 27
 
< 0.1%
4 17
 
< 0.1%
5 23
 
< 0.1%
6 14
 
< 0.1%
7 12
 
< 0.1%
8 12
 
< 0.1%
9 14
 
< 0.1%
ValueCountFrequency (%)
900 1219
1.7%
874 17
 
< 0.1%
843 24
 
< 0.1%
812 25
 
< 0.1%
782 31
 
< 0.1%
756 1
 
< 0.1%
751 33
 
< 0.1%
746 1
 
< 0.1%
726 3
 
< 0.1%
721 21
 
< 0.1%

AMOUNT_OVERDUE
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct9139
Distinct (%)92.6%
Missing60089
Missing (%)85.9%
Infinite0
Infinite (%)0.0%
Mean364730.89
Minimum1
Maximum3.1050141 × 10^8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum1
5-th percentile1286
Q110926
median27500
Q3114104
95-th percentile1244236.8
Maximum3.1050141 × 10^8
Range3.1050141 × 10^8
Interquartile range (IQR)103178

Descriptive statistics

Standard deviation3912815.7
Coefficient of variation (CV)10.727953
Kurtosis4239.5977
Mean364730.89
Median Absolute Deviation (MAD)21904
Skewness58.277857
Sum3.5995292 × 10^9
Variance1.5310127 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 95
 
0.1%
2 19
 
< 0.1%
2980 11
 
< 0.1%
3467 7
 
< 0.1%
163 7
 
< 0.1%
5960 4
 
< 0.1%
3787 4
 
< 0.1%
4 4
 
< 0.1%
2437 4
 
< 0.1%
42 4
 
< 0.1%
Other values (9129) 9710
 
13.9%
(Missing) 60089
85.9%
ValueCountFrequency (%)
1 95
0.1%
2 19
 
< 0.1%
3 3
 
< 0.1%
4 4
 
< 0.1%
5 2
 
< 0.1%
9 1
 
< 0.1%
12 1
 
< 0.1%
13 2
 
< 0.1%
15 1
 
< 0.1%
18 3
 
< 0.1%
ValueCountFrequency (%)
310501413 1
< 0.1%
145915781 1
< 0.1%
95509314 1
< 0.1%
53082500 1
< 0.1%
48790532 1
< 0.1%
45016680 1
< 0.1%
38052500 1
< 0.1%
31824969 1
< 0.1%
29272652 1
< 0.1%
27040270 1
< 0.1%
Distinct6173
Distinct (%)8.8%
Missing0
Missing (%)0.0%
Memory size546.7 KiB

Length

Max length54
Median length48
Mean length25.195389
Min length1

Characters and Unicode

Total characters1762619
Distinct characters14
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5163
Unique (%)7.4%

Sample

1st row000000000000000000000000000000000000052052052021XXX000
2nd row000000000000000000000000000000000000000000000000XXX000
3rd row000000000000000000000000000000000000000000000000XXX000
4th row0
5th row539570570540509506507507507507509509509479478478XXX450
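
The full-length sample rows above are 54 characters long and look like a sequence of fixed-width, three-character codes, with "XXX" standing in where a code is absent. That reading is an assumption, since the report does not document the encoding or say what the codes mean. Under it, a string can be split as in the sketch below, where the helper `split_history` is hypothetical.

```python
def split_history(history, width=3):
    """Split a payment-history string into fixed-width codes."""
    return [history[i:i + width] for i in range(0, len(history), width)]

# Fifth sample row from the report.
sample = "539570570540509506507507507507509509509479478478XXX450"
codes = split_history(sample)

# Assumption: all-digit codes are numeric values; anything else (e.g. "XXX")
# is treated as not reported and skipped.
numeric = [int(c) for c in codes if c.isdigit()]

print(codes)
print(max(numeric))
```

Short values such as the single character "0" in the fourth sample row do not fit the three-character pattern and would need separate handling.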
ValueCountFrequency (%)
0 33791
48.3%
000000000000000000000000000000000000000000000000xxx000 17378
24.8%
000000000000000000000000000000000000000000000000xxxxxx 2494
 
3.6%
900900900900900900900900900900900900900900900900xxx900 702
 
1.0%
000000000000000000000000000000000000xxx000000 538
 
0.8%
000000000000000000000xxxxxxxxxxxxxxxxxxxxxxxx000xxx000 439
 
0.6%
000000000000000000000000000000000000xxx000 361
 
0.5%
021022021000000000000000000000000000000000000000xxx000 317
 
0.5%
000000000000000000000000000000000000xxx000000000xxx000 307
 
0.4%
000000000000000000000000000000000000000000000021xxx000 198
 
0.3%
Other values (6163) 13433
 
19.2%

Most occurring characters

ValueCountFrequency (%)
0 1437965
81.6%
X 124296
 
7.1%
2 45378
 
2.6%
5 28748
 
1.6%
1 26663
 
1.5%
9 22196
 
1.3%
3 18165
 
1.0%
8 14580
 
0.8%
4 12336
 
0.7%
7 12132
 
0.7%
Other values (4) 20160
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1628831
92.4%
Uppercase Letter 127460
 
7.2%
Other Punctuation 3164
 
0.2%
Math Symbol 3164
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1437965
88.3%
2 45378
 
2.8%
5 28748
 
1.8%
1 26663
 
1.6%
9 22196
 
1.4%
3 18165
 
1.1%
8 14580
 
0.9%
4 12336
 
0.8%
7 12132
 
0.7%
6 10668
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
X 124296
97.5%
E 3164
 
2.5%
Other Punctuation
ValueCountFrequency (%)
. 3164
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3164
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1635159
92.8%
Latin 127460
 
7.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1437965
87.9%
2 45378
 
2.8%
5 28748
 
1.8%
1 26663
 
1.6%
9 22196
 
1.4%
3 18165
 
1.1%
8 14580
 
0.9%
4 12336
 
0.8%
7 12132
 
0.7%
6 10668
 
0.7%
Other values (2) 6328
 
0.4%
Latin
ValueCountFrequency (%)
X 124296
97.5%
E 3164
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1762619
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1437965
81.6%
X 124296
 
7.1%
2 45378
 
2.6%
5 28748
 
1.6%
1 26663
 
1.5%
9 22196
 
1.3%
3 18165
 
1.0%
8 14580
 
0.8%
4 12336
 
0.7%
7 12132
 
0.7%
Other values (4) 20160
 
1.1%

PAYMENT_HISTORY_2
Text

MISSING 

Distinct616
Distinct (%)2.0%
Missing39673
Missing (%)56.7%
Memory size546.7 KiB

Length

Max length54
Median length1
Mean length4.5951461
Min length1

Characters and Unicode

Total characters139164
Distinct characters14
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique356
Unique (%)1.2%

Sample

1st row3.00E+19
2nd row0
3rd row0
4th row4.50E+53
5th row1.80E+53
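
Several sample values here, such as 3.00E+19 and 4.50E+53, are written in scientific notation. One plausible explanation is that long digit strings were coerced to floating-point numbers somewhere upstream, for example by a spreadsheet round-trip. The report does not say how the column was produced, so this is only a guess. The sketch below flags such values with a regular expression so they can be counted or excluded; the pattern and helper name are illustrative.

```python
import re

# Matches strings such as "3.00E+19" or "9.01e+53".
SCI_NOTATION = re.compile(r"^\d+(\.\d+)?E[+-]?\d+$", re.IGNORECASE)

def looks_coerced(value):
    """Return True if the value is written in scientific notation."""
    return bool(SCI_NOTATION.match(value))

# The five sample rows shown in the report.
samples = ["3.00E+19", "0", "0", "4.50E+53", "1.80E+53"]
flagged = [v for v in samples if looks_coerced(v)]

print(flagged)
```

If the original digits were indeed lost in such a conversion, they cannot be recovered from the rounded mantissa, so flagged rows are probably best treated as damaged rather than parsed.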
ValueCountFrequency (%)
0 21172
69.9%
3.00e+52 1369
 
4.5%
9.01e+53 786
 
2.6%
6.01e+52 677
 
2.2%
9.01e+52 420
 
1.4%
3.00e+16 214
 
0.7%
3.00e+49 184
 
0.6%
1.20e+53 171
 
0.6%
3.00e+43 157
 
0.5%
6.00e+52 155
 
0.5%
Other values (606) 4980
 
16.4%

Most occurring characters

ValueCountFrequency (%)
0 66158
47.5%
X 23067
 
16.6%
. 7206
 
5.2%
E 7206
 
5.2%
+ 7206
 
5.2%
3 6956
 
5.0%
5 5591
 
4.0%
2 4409
 
3.2%
1 4406
 
3.2%
6 2351
 
1.7%
Other values (4) 4608
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 94479
67.9%
Uppercase Letter 30273
 
21.8%
Other Punctuation 7206
 
5.2%
Math Symbol 7206
 
5.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 66158
70.0%
3 6956
 
7.4%
5 5591
 
5.9%
2 4409
 
4.7%
1 4406
 
4.7%
6 2351
 
2.5%
9 2078
 
2.2%
4 1560
 
1.7%
8 569
 
0.6%
7 401
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
X 23067
76.2%
E 7206
 
23.8%
Other Punctuation
ValueCountFrequency (%)
. 7206
100.0%
Math Symbol
ValueCountFrequency (%)
+ 7206
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 108891
78.2%
Latin 30273
 
21.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 66158
60.8%
. 7206
 
6.6%
+ 7206
 
6.6%
3 6956
 
6.4%
5 5591
 
5.1%
2 4409
 
4.0%
1 4406
 
4.0%
6 2351
 
2.2%
9 2078
 
1.9%
4 1560
 
1.4%
Other values (2) 970
 
0.9%
Latin
ValueCountFrequency (%)
X 23067
76.2%
E 7206
 
23.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 139164
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 66158
47.5%
X 23067
 
16.6%
. 7206
 
5.2%
E 7206
 
5.2%
+ 7206
 
5.2%
3 6956
 
5.0%
5 5591
 
4.0%
2 4409
 
3.2%
1 4406
 
3.2%
6 2351
 
1.7%
Other values (4) 4608
 
3.3%

OWNERSHIP_TYPE
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size546.7 KiB
Joint
52535 
Individual
17423 

Length

Max length10
Median length5
Mean length6.2452471
Min length5

Characters and Unicode

Total characters436905
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowJoint
2nd rowJoint
3rd rowJoint
4th rowJoint
5th rowJoint

Common Values

Value       Count   Frequency (%)
Joint       52535   75.1%
Individual  17423   24.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
joint       52535   75.1%
individual  17423   24.9%

Most occurring characters

ValueCountFrequency (%)
i 87381
20.0%
n 69958
16.0%
J 52535
12.0%
o 52535
12.0%
t 52535
12.0%
d 34846
 
8.0%
I 17423
 
4.0%
v 17423
 
4.0%
u 17423
 
4.0%
a 17423
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 366947
84.0%
Uppercase Letter 69958
 
16.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 87381
23.8%
n 69958
19.1%
o 52535
14.3%
t 52535
14.3%
d 34846
 
9.5%
v 17423
 
4.7%
u 17423
 
4.7%
a 17423
 
4.7%
l 17423
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
J 52535
75.1%
I 17423
 
24.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 436905
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 87381
20.0%
n 69958
16.0%
J 52535
12.0%
o 52535
12.0%
t 52535
12.0%
d 34846
 
8.0%
I 17423
 
4.0%
v 17423
 
4.0%
u 17423
 
4.0%
a 17423
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 436905
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 87381
20.0%
n 69958
16.0%
J 52535
12.0%
o 52535
12.0%
t 52535
12.0%
d 34846
 
8.0%
I 17423
 
4.0%
v 17423
 
4.0%
u 17423
 
4.0%
a 17423
 
4.0%

COLLATERALVALUE
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct27826
Distinct (%)59.2%
Missing22968
Missing (%)32.8%
Infinite0
Infinite (%)0.0%
Mean3586770.9
Minimum1
Maximum7.26147 × 10^8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum1
5-th percentile900000
Q11560000
median2327047
Q33697500
95-th percentile9186928
Maximum7.26147 × 10^8
Range7.26147 × 10^8
Interquartile range (IQR)2137500

Descriptive statistics

Standard deviation7908637.5
Coefficient of variation (CV)2.2049464
Kurtosis2635.6579
Mean3586770.9
Median Absolute Deviation (MAD)925557
Skewness39.330653
Sum1.6854236 × 10^11
Variance6.2546547 × 10^13
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1800000 250
 
0.4%
1500000 238
 
0.3%
1200000 236
 
0.3%
2000000 196
 
0.3%
3000000 159
 
0.2%
1400000 157
 
0.2%
900000 145
 
0.2%
1000000 145
 
0.2%
1300000 144
 
0.2%
1600000 143
 
0.2%
Other values (27816) 45177
64.6%
(Missing) 22968
32.8%
ValueCountFrequency (%)
1 1
< 0.1%
42000 1
< 0.1%
50000 1
< 0.1%
126000 1
< 0.1%
142694 1
< 0.1%
150350 1
< 0.1%
165256 1
< 0.1%
180950 1
< 0.1%
193680 1
< 0.1%
200000 2
< 0.1%
ValueCountFrequency (%)
726147000 1
< 0.1%
544527618 1
< 0.1%
503958000 1
< 0.1%
379341298 1
< 0.1%
368579500 1
< 0.1%
280002020 1
< 0.1%
260357600 1
< 0.1%
253882000 1
< 0.1%
240451200 1
< 0.1%
191787218 1
< 0.1%

TU_SCORE
Real number (ℝ)

HIGH CORRELATION 

Distinct284
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean739.75801
Minimum533
Maximum851
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size546.7 KiB

Quantile statistics

Minimum533
5-th percentile639
Q1717
median753
Q3774
95-th percentile798
Maximum851
Range318
Interquartile range (IQR)57

Descriptive statistics

Standard deviation49.101644
Coefficient of variation (CV)0.066375279
Kurtosis1.1869819
Mean739.75801
Median Absolute Deviation (MAD)25
Skewness-1.1953222
Sum51751991
Variance2410.9715
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
746 2374
 
3.4%
780 2373
 
3.4%
757 1713
 
2.4%
774 1568
 
2.2%
778 1210
 
1.7%
771 1163
 
1.7%
769 1023
 
1.5%
786 989
 
1.4%
776 965
 
1.4%
777 905
 
1.3%
Other values (274) 55675
79.6%
ValueCountFrequency (%)
533 2
 
< 0.1%
537 1
 
< 0.1%
538 1
 
< 0.1%
544 2
 
< 0.1%
546 4
 
< 0.1%
548 2
 
< 0.1%
555 17
< 0.1%
559 5
 
< 0.1%
560 8
< 0.1%
562 12
< 0.1%
ValueCountFrequency (%)
851 1
 
< 0.1%
843 2
 
< 0.1%
840 2
 
< 0.1%
839 1
 
< 0.1%
838 7
< 0.1%
837 3
< 0.1%
836 1
 
< 0.1%
835 2
 
< 0.1%
834 5
< 0.1%
833 5
< 0.1%

PAYMENT_HISTORY_START_DATE
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size546.7 KiB
01-11-2023
69955 
01-12-2023
 
2
01-01-2024
 
1

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters699580
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st row01-11-2023
2nd row01-11-2023
3rd row01-11-2023
4th row01-11-2023
5th row01-11-2023

Common Values

Value       Count   Frequency (%)
01-11-2023  69955   > 99.9%
01-12-2023      2   < 0.1%
01-01-2024      1   < 0.1%
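
With 69955 of 69958 rows sharing one value, this column carries almost no information, which is what the imbalance alert is flagging. One common way to quantify this is an entropy-based score: 1 minus the Shannon entropy divided by its maximum for the number of classes. Whether the report uses exactly this formula is an assumption. The sketch below applies it to the counts in the table above.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); values near 1.0 mean one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 1.0  # a single class is maximally imbalanced
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

# Counts for 01-11-2023, 01-12-2023 and 01-01-2024.
score = imbalance_score([69955, 2, 1])
share = 69955 / 69958  # share of the dominant value

print(f"imbalance score: {score:.4f}")
print(f"dominant share: {share:.4%}")
```

Both the score and the dominant-value share exceed 99.9%, so either reading is consistent with the "> 99.9%" shown in the alert.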

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
01-11-2023  69955   > 99.9%
01-12-2023      2   < 0.1%
01-01-2024      1   < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 209871
30.0%
2 139918
20.0%
0 139917
20.0%
- 139916
20.0%
3 69957
 
10.0%
4 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 559664
80.0%
Dash Punctuation 139916
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 209871
37.5%
2 139918
25.0%
0 139917
25.0%
3 69957
 
12.5%
4 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 139916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 699580
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 209871
30.0%
2 139918
20.0%
0 139917
20.0%
- 139916
20.0%
3 69957
 
10.0%
4 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 699580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 209871
30.0%
2 139918
20.0%
0 139917
20.0%
- 139916
20.0%
3 69957
 
10.0%
4 1
 
< 0.1%

PAYMENT_HISTORY_END_DATE
Categorical

HIGH CORRELATION 

Distinct34
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size546.7 KiB
01-12-2020
26781 
01-10-2023
5060 
01-11-2023
3431 
01-09-2023
3261 
01-07-2023
2865 
Other values (29)
28560 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters699580
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
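
As a rough illustration of the Unicode properties mentioned above, the sketch below counts general categories over the characters of two sample values. It relies on Python's standard `unicodedata` module; the helper name `category_counts` is made up for this example and is not part of the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # 'Nd' = Decimal Number, 'Pd' = Dash Punctuation, and so on.
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values as they appear in the sample rows of this variable.
sample = ["01-12-2020", "01-12-2022"]
counts = category_counts(sample)
print(counts)
```

Each 10-character value contributes eight `Nd` and two `Pd` characters, which is in line with the 80.0% / 20.0% category split reported for this variable.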

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 01-12-2020
2nd row: 01-12-2020
3rd row: 01-12-2020
4th row: 01-12-2022
5th row: 01-12-2020
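
The values are stored as text, so the variable is profiled as Categorical rather than as a date. If the strings are day-month-year (an assumption; these sample rows alone cannot distinguish day-first from month-first), they could be parsed along the following lines. `parse_dmy` is a hypothetical helper, not something the report defines.

```python
from datetime import datetime


def parse_dmy(text):
    """Parse a 'DD-MM-YYYY' string into a date (format is an assumption)."""
    return datetime.strptime(text, "%d-%m-%Y").date()


# Values copied from the tables for this variable.
values = ["01-12-2020", "01-10-2023", "01-11-2023"]
parsed = [parse_dmy(v) for v in values]

# Once parsed, the values sort chronologically instead of alphabetically.
print(min(parsed), max(parsed))
```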

Common Values

Value | Count | Frequency (%)
01-12-2020 | 26781 | 38.3%
01-10-2023 | 5060 | 7.2%
01-11-2023 | 3431 | 4.9%
01-09-2023 | 3261 | 4.7%
01-07-2023 | 2865 | 4.1%
01-03-2023 | 2759 | 3.9%
01-08-2023 | 2592 | 3.7%
01-08-2022 | 2567 | 3.7%
01-12-2022 | 2318 | 3.3%
01-04-2023 | 2256 | 3.2%
Other values (24) | 16068 | 23.0%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 202801 | 29.0%
2 | 182733 | 26.1%
- | 139916 | 20.0%
1 | 118281 | 16.9%
3 | 34768 | 5.0%
8 | 5258 | 0.8%
9 | 5059 | 0.7%
7 | 2987 | 0.4%
5 | 2832 | 0.4%
4 | 2744 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 559664 | 80.0%
Dash Punctuation | 139916 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 202801 | 36.2%
2 | 182733 | 32.7%
1 | 118281 | 21.1%
3 | 34768 | 6.2%
8 | 5258 | 0.9%
9 | 5059 | 0.9%
7 | 2987 | 0.5%
5 | 2832 | 0.5%
4 | 2744 | 0.5%
6 | 2201 | 0.4%

Dash Punctuation
Value | Count | Frequency (%)
- | 139916 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 699580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 202801 | 29.0%
2 | 182733 | 26.1%
- | 139916 | 20.0%
1 | 118281 | 16.9%
3 | 34768 | 5.0%
8 | 5258 | 0.8%
9 | 5059 | 0.7%
7 | 2987 | 0.4%
5 | 2832 | 0.4%
4 | 2744 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 699580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 202801 | 29.0%
2 | 182733 | 26.1%
- | 139916 | 20.0%
1 | 118281 | 16.9%
3 | 34768 | 5.0%
8 | 5258 | 0.8%
9 | 5059 | 0.7%
7 | 2987 | 0.4%
5 | 2832 | 0.4%
4 | 2744 | 0.4%

DATE_REPORTED_AND_CERTIFIED
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 546.7 KiB

Value | Count
30-11-2023 | 69955
05-12-2023 | 2
02-01-2024 | 1

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 699580
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 30-11-2023
2nd row: 30-11-2023
3rd row: 30-11-2023
4th row: 30-11-2023
5th row: 30-11-2023

Common Values

Value | Count | Frequency (%)
30-11-2023 | 69955 | > 99.9%
05-12-2023 | 2 | < 0.1%
02-01-2024 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
30-11-2023 | 69955 | > 99.9%
05-12-2023 | 2 | < 0.1%
02-01-2024 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 559664 | 80.0%
Dash Punctuation | 139916 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 139919 | 25.0%
0 | 139917 | 25.0%
1 | 139913 | 25.0%
3 | 139912 | 25.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 139916 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 699580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 699580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

DATE_OF_LAST_PAYMENT
Date

MISSING 

Distinct: 468
Distinct (%): 0.7%
Missing: 3312
Missing (%): 4.7%
Memory size: 546.7 KiB
Minimum: 2009-01-15 00:00:00
Maximum: 2023-12-28 00:00:00

Histogram with fixed size bins (bins=50)

Reported_Date
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 546.7 KiB

Value | Count
30-11-2023 | 69955
05-12-2023 | 2
02-01-2024 | 1

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 699580
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 30-11-2023
2nd row: 30-11-2023
3rd row: 30-11-2023
4th row: 30-11-2023
5th row: 30-11-2023

Common Values

Value | Count | Frequency (%)
30-11-2023 | 69955 | > 99.9%
05-12-2023 | 2 | < 0.1%
02-01-2024 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
30-11-2023 | 69955 | > 99.9%
05-12-2023 | 2 | < 0.1%
02-01-2024 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 559664 | 80.0%
Dash Punctuation | 139916 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 139919 | 25.0%
0 | 139917 | 25.0%
1 | 139913 | 25.0%
3 | 139912 | 25.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 139916 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 699580 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 699580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 139919 | 20.0%
0 | 139917 | 20.0%
- | 139916 | 20.0%
1 | 139913 | 20.0%
3 | 139912 | 20.0%
5 | 2 | < 0.1%
4 | 1 | < 0.1%

DATE_OF_BIRTH
Date

Distinct: 15335
Distinct (%): 21.9%
Missing: 0
Missing (%): 0.0%
Memory size: 546.7 KiB
Minimum: 1927-04-24 00:00:00
Maximum: 2005-11-02 00:00:00

Histogram with fixed size bins (bins=50)

OCCUPATION_TYPE
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 19113
Missing (%): 27.3%
Memory size: 546.7 KiB

Value | Count
SALARIED | 28677
SENP | 11963
OTHERS | 5272
SEP | 4932
R18 | 1

Length

Max length: 8
Median length: 8
Mean length: 6.366388
Min length: 3

Characters and Unicode

Total characters: 323699
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: SALARIED
2nd row: SALARIED
3rd row: SALARIED
4th row: SALARIED
5th row: SALARIED

Common Values

Value | Count | Frequency (%)
SALARIED | 28677 | 41.0%
SENP | 11963 | 17.1%
OTHERS | 5272 | 7.5%
SEP | 4932 | 7.0%
R18 | 1 | < 0.1%
(Missing) | 19113 | 27.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
salaried | 28677 | 56.4%
senp | 11963 | 23.5%
others | 5272 | 10.4%
sep | 4932 | 9.7%
r18 | 1 | < 0.1%
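
The two Common Values tables above show different percentages for the same counts. The arithmetic suggests that the first table divides by all rows, including the 19113 missing ones, while the plot table divides by non-missing rows only. A minimal sketch that recomputes both columns from the reported counts (the helper `pct` is made up for this example):

```python
# Counts copied from the OCCUPATION_TYPE tables above.
counts = {"SALARIED": 28677, "SENP": 11963, "OTHERS": 5272, "SEP": 4932, "R18": 1}
missing = 19113

present = sum(counts.values())  # rows that have a value
total = present + missing       # all rows in the dataset


def pct(part, whole):
    """Percentage rounded to one decimal place, as displayed in the report."""
    return round(100 * part / whole, 1)


for value, n in counts.items():
    # First percentage: share of all rows; second: share of non-missing rows.
    print(f"{value:<9}{n:>6}  {pct(n, total):>5}%  {pct(n, present):>5}%")
```

For SALARIED this gives 41.0% of all rows and 56.4% of non-missing rows, matching the two tables.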

Most occurring characters

Value | Count | Frequency (%)
A | 57354 | 17.7%
S | 50844 | 15.7%
E | 50844 | 15.7%
R | 33950 | 10.5%
L | 28677 | 8.9%
I | 28677 | 8.9%
D | 28677 | 8.9%
P | 16895 | 5.2%
N | 11963 | 3.7%
O | 5272 | 1.6%
Other values (4) | 10546 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 323697 | > 99.9%
Decimal Number | 2 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 57354 | 17.7%
S | 50844 | 15.7%
E | 50844 | 15.7%
R | 33950 | 10.5%
L | 28677 | 8.9%
I | 28677 | 8.9%
D | 28677 | 8.9%
P | 16895 | 5.2%
N | 11963 | 3.7%
O | 5272 | 1.6%
Other values (2) | 10544 | 3.3%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
8 | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 323697 | > 99.9%
Common | 2 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 57354 | 17.7%
S | 50844 | 15.7%
E | 50844 | 15.7%
R | 33950 | 10.5%
L | 28677 | 8.9%
I | 28677 | 8.9%
D | 28677 | 8.9%
P | 16895 | 5.2%
N | 11963 | 3.7%
O | 5272 | 1.6%
Other values (2) | 10544 | 3.3%

Common
Value | Count | Frequency (%)
1 | 1 | 50.0%
8 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 323699 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 57354 | 17.7%
S | 50844 | 15.7%
E | 50844 | 15.7%
R | 33950 | 10.5%
L | 28677 | 8.9%
I | 28677 | 8.9%
D | 28677 | 8.9%
P | 16895 | 5.2%
N | 11963 | 3.7%
O | 5272 | 1.6%
Other values (4) | 10546 | 3.3%

GENDER
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 546.7 KiB

Value | Count
Male | 47354
Female | 22545
Other | 59

Length

Max length: 6
Median length: 4
Mean length: 4.6453729
Min length: 4
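
The mean length can be recovered from the value counts: it is the total number of characters divided by the number of rows. A small sketch using the counts reported for this variable (variable names are chosen for this example only):

```python
# Value counts copied from the GENDER summary above.
counts = {"Male": 47354, "Female": 22545, "Other": 59}

# Each value contributes len(value) characters per occurrence.
total_chars = sum(len(value) * n for value, n in counts.items())
rows = sum(counts.values())
mean_length = total_chars / rows

print(total_chars, rows, round(mean_length, 7))
```

The total agrees with the 324981 characters listed under Characters and Unicode, and the median length is 4 because more than half of the rows are "Male".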

Characters and Unicode

Total characters: 324981
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Male
4th row: Female
5th row: Male

Common Values

Value | Count | Frequency (%)
Male | 47354 | 67.7%
Female | 22545 | 32.2%
Other | 59 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
male | 47354 | 67.7%
female | 22545 | 32.2%
other | 59 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
e | 92503 | 28.5%
a | 69899 | 21.5%
l | 69899 | 21.5%
M | 47354 | 14.6%
F | 22545 | 6.9%
m | 22545 | 6.9%
O | 59 | < 0.1%
t | 59 | < 0.1%
h | 59 | < 0.1%
r | 59 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 255023 | 78.5%
Uppercase Letter | 69958 | 21.5%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 92503 | 36.3%
a | 69899 | 27.4%
l | 69899 | 27.4%
m | 22545 | 8.8%
t | 59 | < 0.1%
h | 59 | < 0.1%
r | 59 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
M | 47354 | 67.7%
F | 22545 | 32.2%
O | 59 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 324981 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 92503 | 28.5%
a | 69899 | 21.5%
l | 69899 | 21.5%
M | 47354 | 14.6%
F | 22545 | 6.9%
m | 22545 | 6.9%
O | 59 | < 0.1%
t | 59 | < 0.1%
h | 59 | < 0.1%
r | 59 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 324981 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 92503 | 28.5%
a | 69899 | 21.5%
l | 69899 | 21.5%
M | 47354 | 14.6%
F | 22545 | 6.9%
m | 22545 | 6.9%
O | 59 | < 0.1%
t | 59 | < 0.1%
h | 59 | < 0.1%
r | 59 | < 0.1%

ACTUAL_ROI
Real number (ℝ)

HIGH CORRELATION 

Distinct: 794
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.010569
Minimum: 0
Maximum: 45
Zeros: 7
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 546.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 11.25
median: 13.35
Q3: 18
95-th percentile: 34
Maximum: 45
Range: 45
Interquartile range (IQR): 6.75

Descriptive statistics

Standard deviation: 6.9835257
Coefficient of variation (CV): 0.43618222
Kurtosis: 1.7186182
Mean: 16.010569
Median Absolute Deviation (MAD): 2.38
Skewness: 1.6127186
Sum: 1120067.4
Variance: 48.769631
Monotonicity: Not monotonic
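
Several of the statistics above are tied together by simple identities: the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of observations. The sketch below recomputes them from the reported figures; the variable names are chosen for this example only.

```python
# Figures copied from the ACTUAL_ROI statistics above.
minimum, maximum = 0.0, 45.0
q1, q3 = 11.25, 18.0
std, mean = 6.9835257, 16.010569
total, n = 1120067.4, 69958  # Sum and number of observations

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
mean_from_sum = total / n        # Mean recovered from the sum

print(value_range, iqr, round(cv, 8), round(variance, 6), round(mean_from_sum, 6))
```

Small differences in the last digits are expected, because the inputs are already rounded in the report.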
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
10.5 | 2076 | 3.0%
10 | 1964 | 2.8%
10.92 | 1930 | 2.8%
13.8 | 1768 | 2.5%
36 | 1494 | 2.1%
11.22 | 1435 | 2.1%
13.5 | 1351 | 1.9%
35 | 1232 | 1.8%
13 | 1143 | 1.6%
20 | 1050 | 1.5%
Other values (784) | 54515 | 77.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 7 | < 0.1%
7.5 | 4 | < 0.1%
7.75 | 5 | < 0.1%
8 | 1 | < 0.1%
8.25 | 3 | < 0.1%
8.3 | 1 | < 0.1%
8.5 | 12 | < 0.1%
8.6 | 1 | < 0.1%
8.62 | 5 | < 0.1%
8.65 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
45 | 1 | < 0.1%
44.25 | 6 | < 0.1%
44 | 1 | < 0.1%
43.9 | 1 | < 0.1%
43.6 | 1 | < 0.1%
43.54 | 1 | < 0.1%
43 | 1 | < 0.1%
42.9 | 11 | < 0.1%
42.6 | 1 | < 0.1%
42.58 | 1 | < 0.1%

Interactions


Correlations

Variable | ACCOUNT_TYPE | ACTUAL_PAYMT_AMT | ACTUAL_ROI | AMOUNT_OVERDUE | COLLATERALVALUE | CURRENT_BALANCE | DATE_REPORTED_AND_CERTIFIED | EMI_AMOUNT | GENDER | HIGH_CREDIT_OR_SANCTIONED_AMOUNT | LOAN_CLASSIFICATION | OCCUPATION_TYPE | OWNERSHIP_TYPE | PAYMENT_HISTORY_END_DATE | PAYMENT_HISTORY_START_DATE | REPAYMENT_TENURE | Reported_Date | TU_SCORE
ACCOUNT_TYPE | 1.000 | -0.159 | 0.255 | -0.219 | 0.214 | -0.170 | 0.009 | -0.195 | 0.154 | -0.183 | -0.011 | 0.351 | 0.935 | 0.403 | 0.009 | -0.031 | 0.009 | -0.025
ACTUAL_PAYMT_AMT | -0.159 | 1.000 | -0.237 | 0.522 | 0.547 | 0.614 | 0.000 | 0.787 | 0.000 | 0.606 | -0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.032 | 0.000 | 0.054
ACTUAL_ROI | 0.255 | -0.237 | 1.000 | -0.269 | -0.086 | -0.569 | 0.000 | -0.325 | 0.118 | -0.627 | 0.037 | 0.236 | 0.710 | 0.198 | 0.000 | -0.176 | 0.000 | -0.253
AMOUNT_OVERDUE | -0.219 | 0.522 | -0.269 | 1.000 | 0.410 | 0.644 | 0.000 | 0.679 | 0.000 | 0.626 | 0.804 | 0.000 | 0.002 | 0.000 | 0.000 | 0.057 | 0.000 | -0.489
COLLATERALVALUE | 0.214 | 0.547 | -0.086 | 0.410 | 1.000 | 0.696 | 0.000 | 0.714 | 0.000 | 0.762 | -0.005 | 0.015 | 0.046 | 0.000 | 0.000 | 0.079 | 0.000 | -0.033
CURRENT_BALANCE | -0.170 | 0.614 | -0.569 | 0.644 | 0.696 | 1.000 | 0.000 | 0.803 | 0.000 | 0.937 | 0.067 | 0.006 | 0.007 | 0.000 | 0.000 | 0.550 | 0.000 | 0.040
DATE_REPORTED_AND_CERTIFIED | 0.009 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | -0.014 | 0.000 | -0.004 | -0.003 | 1.000 | 0.000 | 0.081 | 1.000 | 0.014 | 1.000 | 0.009
EMI_AMOUNT | -0.195 | 0.787 | -0.325 | 0.679 | 0.714 | 0.803 | -0.014 | 1.000 | 0.000 | 0.813 | 0.018 | 0.009 | 0.018 | 0.000 | 0.000 | 0.067 | 0.000 | 0.022
GENDER | 0.154 | 0.000 | 0.118 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | -0.178 | 0.032 | 0.273 | 0.216 | 0.074 | 0.000 | -0.003 | 0.000 | -0.108
HIGH_CREDIT_OR_SANCTIONED_AMOUNT | -0.183 | 0.606 | -0.627 | 0.626 | 0.762 | 0.937 | -0.004 | 0.813 | -0.178 | 1.000 | 0.061 | 0.007 | 0.007 | 0.000 | 0.000 | 0.289 | 0.000 | 0.077
LOAN_CLASSIFICATION | -0.011 | -0.015 | 0.037 | 0.804 | -0.005 | 0.067 | -0.003 | 0.018 | 0.032 | 0.061 | 1.000 | 0.082 | 0.086 | 0.078 | 0.000 | 0.136 | 0.000 | -0.515
OCCUPATION_TYPE | 0.351 | 0.000 | 0.236 | 0.000 | 0.015 | 0.006 | 1.000 | 0.009 | 0.273 | 0.007 | 0.082 | 1.000 | 0.537 | 0.239 | 1.000 | 0.159 | 1.000 | -0.128
OWNERSHIP_TYPE | 0.935 | 0.000 | 0.710 | 0.002 | 0.046 | 0.007 | 0.000 | 0.018 | 0.216 | 0.007 | 0.086 | 0.537 | 1.000 | 0.503 | 0.000 | 0.057 | 0.000 | 0.118
PAYMENT_HISTORY_END_DATE | 0.403 | 0.000 | 0.198 | 0.000 | 0.000 | 0.000 | 0.081 | 0.000 | 0.074 | 0.000 | 0.078 | 0.239 | 0.503 | 1.000 | 0.081 | 0.037 | 0.081 | 0.029
PAYMENT_HISTORY_START_DATE | 0.009 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.081 | 1.000 | -0.004 | 1.000 | -0.003
REPAYMENT_TENURE | -0.031 | 0.032 | -0.176 | 0.057 | 0.079 | 0.550 | 0.014 | 0.067 | -0.003 | 0.289 | 0.136 | 0.159 | 0.057 | 0.037 | -0.004 | 1.000 | 0.000 | -0.145
Reported_Date | 0.009 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.081 | 1.000 | 0.000 | 1.000 | 0.009
TU_SCORE | -0.025 | 0.054 | -0.253 | -0.489 | -0.033 | 0.040 | 0.009 | 0.022 | -0.108 | 0.077 | -0.515 | -0.128 | 0.118 | 0.029 | -0.003 | -0.145 | 0.009 | 1.000
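
One way to read the matrix is to list the pairs whose absolute coefficient clears a cutoff. The sketch below does this for a handful of coefficients copied from the matrix. The 0.5 cutoff is an assumption: it appears consistent with the HIGH CORRELATION alerts in the overview, but the report does not state it.

```python
# A few coefficients copied from the correlation matrix above.
corr = {
    ("CURRENT_BALANCE", "HIGH_CREDIT_OR_SANCTIONED_AMOUNT"): 0.937,
    ("ACCOUNT_TYPE", "OWNERSHIP_TYPE"): 0.935,
    ("REPAYMENT_TENURE", "CURRENT_BALANCE"): 0.550,
    ("TU_SCORE", "LOAN_CLASSIFICATION"): -0.515,
    ("TU_SCORE", "AMOUNT_OVERDUE"): -0.489,
    ("PAYMENT_HISTORY_END_DATE", "ACCOUNT_TYPE"): 0.403,
}

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Keep pairs at or above the cutoff, strongest first (sign is ignored).
flagged = sorted(
    (pair for pair, r in corr.items() if abs(r) >= THRESHOLD),
    key=lambda pair: -abs(corr[pair]),
)

for a, b in flagged:
    print(f"{a} ~ {b}: {corr[(a, b)]:+.3f}")
```

Under this cutoff TU_SCORE is paired with LOAN_CLASSIFICATION (-0.515) but not with AMOUNT_OVERDUE (-0.489), which is how the overview alerts list it.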

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
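
A minimal sketch of what nullity correlation means, assuming it is computed as the Pearson correlation between the missing-value indicators of two columns. The two toy columns are made up for illustration and are not taken from the dataset; `pearson` and `nullity_correlation` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if a column is never or always missing.
    return cov / sqrt(var_x * var_y)


def nullity_correlation(col_a, col_b):
    """Correlate the 'is missing' indicators (1 = missing) of two columns."""
    miss_a = [1 if v is None else 0 for v in col_a]
    miss_b = [1 if v is None else 0 for v in col_b]
    return pearson(miss_a, miss_b)


# Made-up columns: col_b is missing only where col_a is also missing.
col_a = [8.0, 12.0, None, None, 17.0, None]
col_b = [142.0, 156.0, None, None, 97.0, 240.0]

r = nullity_correlation(col_a, col_b)
print(round(r, 3))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.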

Sample

ID | ACCOUNT_TYPE | HIGH_CREDIT_OR_SANCTIONED_AMOUNT | DATE_OPENED | CURRENT_BALANCE | ACTUAL_PAYMT_AMT | EMI_AMOUNT | REPAYMENT_TENURE | LOAN_CLASSIFICATION | AMOUNT_OVERDUE | PAYMENT_HISTORY_1 | PAYMENT_HISTORY_2 | OWNERSHIP_TYPE | COLLATERALVALUE | TU_SCORE | PAYMENT_HISTORY_START_DATE | PAYMENT_HISTORY_END_DATE | DATE_REPORTED_AND_CERTIFIED | DATE_OF_LAST_PAYMENT | Reported_Date | DATE_OF_BIRTH | OCCUPATION_TYPE | GENDER | ACTUAL_ROI
0A002338349Housing Loan81851713-06-20146407428627.08627.0142.00NaN000000000000000000000000000000000000052052052021XXX0003.00E+19Joint2086000.068101-11-202301-12-202030-11-202330-11-202330-11-202321-05-1971SALARIEDMale12.32
1A002000537Housing Loan124375505-12-2012949446500.012210.0156.00NaN000000000000000000000000000000000000000000000000XXX0000Joint1536000.078401-11-202301-12-202030-11-202330-11-202330-11-202312-11-1984SALARIEDFemale12.42
2A002421579Housing Loan182642216-03-2017129684916036.016036.0148.00NaN000000000000000000000000000000000000000000000000XXX0000Joint3587700.074801-11-202301-12-202030-11-202311-11-202330-11-202301-08-1986SALARIEDMale10.87
3A002152345Housing Loan184791624-12-2022172425626996.0NaNNaN0NaN0NaNJoint3049600.078601-11-202301-12-202230-11-202305-11-202330-11-202322-07-1969NaNFemale10.00
4A001952834Housing Loan231838621-10-2014179093725131.0NaN116.0539431551.0539570570540509506507507507507509509509479478478XXX4504.50E+53Joint2785400.066401-11-202301-12-202030-11-202327-11-202330-11-202328-10-1974SALARIEDMale11.97
5A002239370Property Loan116387021-04-201795937012544.0NaN158.02112533.0021022052053052052052052050050000000000000000264XXX2101.80E+53Joint2369812.071001-11-202301-12-202030-11-202330-11-202330-11-202328-10-1966SALARIEDMale12.97
6A000936177Housing Loan240031801-03-20232385404500.0NaNNaN0NaN0NaNJoint3068676.077401-11-202301-03-202330-11-202315-11-202330-11-202305-05-1990NaNMale14.25
7A001137499Personal Loan13800026-02-2023968696756.06756.0NaN0NaN0NaNIndividualNaN71401-11-202301-02-202330-11-202305-11-202330-11-202307-09-1987SALARIEDMale15.99
8A002421104Housing Loan143606312-01-20151302519NaN17247.097.09001223863.0900900900900900900900900900900900900900900900900XXX9009.01E+53Joint1844612.059901-11-202301-12-202030-11-202301-07-202230-11-202315-07-1982SENPMale13.82
9A001132923Housing Loan88999227-07-20216041487421.07420.0240.00NaN000000000000000000000000000000000000000000000000XXXXXX000000000XXXXXX000000000000000000Joint1176000.077201-11-202301-07-202130-11-202305-11-202330-11-202301-01-1987NaNMale14.50
ID | ACCOUNT_TYPE | HIGH_CREDIT_OR_SANCTIONED_AMOUNT | DATE_OPENED | CURRENT_BALANCE | ACTUAL_PAYMT_AMT | EMI_AMOUNT | REPAYMENT_TENURE | LOAN_CLASSIFICATION | AMOUNT_OVERDUE | PAYMENT_HISTORY_1 | PAYMENT_HISTORY_2 | OWNERSHIP_TYPE | COLLATERALVALUE | TU_SCORE | PAYMENT_HISTORY_START_DATE | PAYMENT_HISTORY_END_DATE | DATE_REPORTED_AND_CERTIFIED | DATE_OF_LAST_PAYMENT | Reported_Date | DATE_OF_BIRTH | OCCUPATION_TYPE | GENDER | ACTUAL_ROI
69948A001905850Housing Loan188284927-08-2022187247720505.0NaN240.00NaN0NaNJoint3055400.077401-11-202301-08-202230-11-202305-11-202330-11-202311-05-1991NaNFemale11.50
69949A001397677Housing Loan66950615-02-20181985174866.0NaN67.00NaN000000000000000000000000000000000000000000000000XXX0000Joint1881424.078001-11-202301-12-202030-11-202312-11-202330-11-202315-04-1950SENPMale11.47
69950B000198178Personal Loan50400014-09-202348382814755.0NaNNaN0NaN0NaNIndividualNaN77101-11-202301-09-202330-11-202310-11-202330-11-202315-05-1996SALARIEDMale12.00
69951A001175214Business Loan69113210-10-202256375221042.020665.0NaN0NaN000000000000000000000000000000000000XXX000NaNJointNaN70601-11-202301-10-202230-11-202305-11-202330-11-202301-01-1964NaNFemale19.00
69952B000092060Personal Loan700005-10-202353301884.01884.0NaN0NaN0NaNIndividualNaN73401-11-202301-10-202330-11-202330-11-202330-11-202318-05-1996SEPFemale36.00
69953A000195582Property Loan260000015-03-2018238143730762.030762.0162.02130762.0021000000000000000000000000000000000000000000000XXX0000Joint16335000.074401-11-202301-12-202030-11-202301-11-202330-11-202305-02-1989SENPFemale12.97
69954A001735798Business Loan101618826-08-202279048629851.0NaN48.00NaN0NaNJointNaN67301-11-202301-08-202230-11-202305-11-202330-11-202322-08-1992NaNMale18.00
69955A002435973Business Loan33547522-03-202228110610388.0NaN48.024082468.0240210179149118087116085055024000000000000000000XXXXXX0JointNaN56701-11-202301-03-202230-11-202319-06-202330-11-202313-05-1987NaNMale21.00
69956A000027755Housing Loan158516012-09-2015125523516739.0NaN143.00NaN000000000000000000000000000000000000000000000000XXX0000Joint9506700.079801-11-202301-12-202030-11-202314-11-202330-11-202301-06-1971SALARIEDMale12.07
69957A000500401Housing Loan54297317-06-20233572172878.0NaNNaN0NaN0NaNJoint1781027.075701-11-202301-06-202330-11-202305-11-202330-11-202303-02-1989OTHERSFemale12.85